Abstract. In recent years, the Graphics Processing Unit (GPU) has emerged as a
low-cost alternative for high performance computing, enabling impressive speed-ups
for a range of scientific computing applications. Early adopters in astronomy are
already benefiting from adapting their codes to take advantage of the GPU's massively
parallel processing paradigm. I give an introduction to, and overview of, the use of GPUs
in astronomy to date, highlighting the adoption and application trends from the first
~100 GPU-related publications in astronomy. I discuss the opportunities and challenges
of utilising GPU computing clusters, such as the new Australian GPU supercomputer,
gSTAR, for accelerating the rate of astronomical discovery.



1. Introduction 

For at least four decades from the 1960s, advances in traditional computation on
single-core CPUs have been driven by increases in transistor density and clock rate. This is
seen through the well-established Moore's Law (Moore 1965) biennial doubling in the
number of transistors per integrated circuit, and a corresponding increase in processing 
performance. In principle, it was possible to implement a code once, and achieve faster 
(approximately double) computation simply by purchasing new hardware, at lower cost, 
every two years. In practice, new generations of CPUs also provided additional bene- 
fits (such as increased system memory, improved caching, etc.), resulting in on-going 
algorithmic improvements and software updates. 

In the early 2000s, CPU clock-rates began to plateau - mainly due to manufactur- 
ing constraints, such as difficulties in keeping ever-faster CPUs sufficiently cool to work 
without melting. Further processing improvements, and the continuation of Moore's 
Law growth, were achieved by moving to multi-core solutions. Indeed, the likely future 
of CPUs is that they will become increasingly multi-core: codes or algorithms that can 
be expressed in parallel form will derive the most benefit from these new architectures. 
A preview of this highly multi-core future is available now in the guise of the many-core 
graphics processing unit (GPU). Leveraging advances in hardware that were designed 
to enhance and improve graphical performance in support of the many-billion dollar in- 
ternational computer gaming industry, GPUs have rapidly become credible alternatives 
for low-cost, massively parallel scientific computation. Astronomers have been quick 
to adopt GPUs as a powerful new component of their computational arsenal. 

Following early successes at speeding-up codes on single GPU systems and small- 
scale GPU clusters, a growing number of research institutions are now making major 
investments in significant high-performance computing (HPC) clusters deriving a
substantial fraction of their (theoretical) peak processing performance from GPUs. At the
dawn of this exciting new era of GPU-powered HPC clusters, what do astronomers need
to know about GPUs in order to take advantage of new computational opportunities?
What does a potential 10x or 100x processing speed-up mean in terms of accelerating
the rate of astronomical discovery? What lessons can we learn, and what trends can we
identify, from the early adopters of GPUs in astronomy? And now that we have O(100)
Tflop/s GPU-clusters at our disposal¹, just what are we going to do with them?



2. GPUs for Scientific Computation 

In essence, the GPU acts as a computational co-processor to the CPU, a mode of op- 
eration not unfamiliar to computer programmers (and owners) of the 1980s who could 
opt to use a maths co-processor or floating point unit (FPU) to accelerate mathematical 
operations. While modern GPUs offer a much wider range of programmable capabil- 
ity than the earlier FPUs, they are not able to completely replace the CPU - nor are 
they likely to. In general terms, GPUs achieve their performance at the hardware level 
by trading off the large-memory caches and sophisticated control logic of CPUs (ac- 
commodating software solutions for activities as diverse as opening a file from a local 
disk, serving a web-page in a browser, and numerical processing for an astrophysical 
simulation) for circuit-area devoted to fast floating point computations. 

As the potential for using the highly-parallel GPU architecture for scientific
computation became apparent (Venkatasubramanian 2003), the notion of general purpose
computation on graphics processing units (GPGPU) began to gain momentum. Early
attempts to utilise the increased computational performance of GPUs required
programming in shader languages [e.g. NVIDIA's C for Graphics, Cg, was used by Rosa et al.
(2004) and Portegies Zwart, Belleman, & Geldof (2007)]. For graphics, the hardware
processing pipeline is optimised to calculate red-green-blue (RGB) colours and alpha
(A) channel transparency for pixels, vertices and polygons, achieved through the use 
of customised software fragment shader functions. Implementation of a scientific algo- 
rithm was only possible if it could be recast as a shader, often requiring storing data in 
structures that shared the "four floating point numbers" structure of RGBA. 
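
As a rough illustration of the recasting described above, the sketch below packs particle data into four-float records, the same shape as an RGBA texel. The field assignment (x, y, z in the R, G, B slots; mass in the A slot) and the function names are assumptions made for illustration only; they are not taken from any of the cited shader codes.

```python
import struct

# Hypothetical layout: position (x, y, z) in the R, G, B slots and the
# particle mass in the A slot. One particle occupies one four-float "texel".
TEXEL = struct.Struct("<4f")  # four little-endian 32-bit floats


def pack_particles(particles):
    """Pack (x, y, z, mass) tuples into a flat RGBA-style byte buffer."""
    return b"".join(TEXEL.pack(x, y, z, m) for (x, y, z, m) in particles)


def unpack_particles(buffer):
    """Recover the (x, y, z, mass) tuples from the byte buffer."""
    return [TEXEL.unpack_from(buffer, offset)
            for offset in range(0, len(buffer), TEXEL.size)]


# Values chosen to be exactly representable in single precision.
particles = [(0.0, 0.0, 0.0, 1.0), (1.0, 0.5, -2.0, 0.25)]
texture = pack_particles(particles)
```

Any quantity that does not fit naturally into groups of four floats would need padding or a second buffer, which is one reason the shader route was awkward for general algorithms.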

The advent of the Compute Unified Device Architecture (CUDA²) application
programming interface (API) from NVIDIA, and the open-standard alternative OpenCL³
developed by the Khronos Group, has dramatically changed the usability of the GPU
for general computation.⁴ Indeed, certain GPU products from vendors, such as the
Tesla series from NVIDIA, are sold with scientific computing in mind: architecturally
equivalent to consumer graphics hardware, they lack the capacity to output graphics
to a display device, but offer increased memory and provision for error-correcting
memory, neither of which is required by the home computer gamer. Moreover, while
early generations of GPUs only supported 32-bit (single precision) floating point
computation, the higher-end solutions now also provide 64-bit (double precision) support at
comparable processing speeds.

¹ We use the notation Gflop/s = 10^9 floating point operations per second and Tflop/s = 10^12 flop/s.
² http://www.nvidia.com/cuda
³ http://www.khronos.org/opencl
⁴ For more details on GPU programming, see, e.g. Kirk & Hwu (2010) or Sanders & Kandrot (2010).



3. Early Adopters and Emerging Trends 

One of the first astronomical problems adapted to GPU was acceleration of the
N-body force problem, through computation of the O(N^2) pair-wise forces between
particles. Early GPU implementations were reported by Nyland et al. (2004), using Cg and
OpenGL on an NVIDIA GPU, while Elsen et al. (2006) and Elsen et al. (2007) used
BrookGPU (Buck et al. 2004) on an ATI X1900XTX card. Both groups found that
the high arithmetic intensity of the force calculation was ideally suited to the GPU's
architecture, and simple code optimisations could give speed-ups of more than 20x
compared to existing CPU implementations. Moreover, they achieved computational
performance comparable to the more expensive, custom GRAPE-6A hardware.
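
To make the structure of the pair-wise force problem concrete, the following is a minimal CPU-only sketch of direct summation in plain Python. It is not the code of any of the groups cited above: the units (G = 1), the softening parameter and the function name are assumptions chosen for illustration. The point is the double loop: each of the N particles independently sums over all others, which is what makes the problem both O(N^2) and easy to spread across many GPU threads.

```python
import math


def pairwise_accelerations(positions, masses, softening=1e-3):
    """Direct-summation gravitational accelerations with G = 1.

    Every particle interacts with every other particle, so the cost
    grows as N^2. The outer loop iterations are independent of each
    other; on a GPU each could be assigned to its own thread.
    """
    n = len(positions)
    eps2 = softening * softening
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        xi, yi, zi = positions[i]
        for j in range(n):
            if i == j:
                continue  # no self-interaction
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            dz = positions[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            acc[i][0] += masses[j] * dx * inv_r3
            acc[i][1] += masses[j] * dy * inv_r3
            acc[i][2] += masses[j] * dz * inv_r3
    return acc


# Two unit masses one length unit apart attract with equal and opposite
# accelerations along the x-axis.
two_body = pairwise_accelerations([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                                  [1.0, 1.0], softening=0.0)
```

Each inner-loop step performs roughly twenty floating point operations on a handful of loaded values, which is the high arithmetic intensity referred to above.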



Rosa et al. (2004) examined a real-time problem - recovery of the wave-front
phase from a Shack-Hartmann sensor. Reporting on an implementation of the iterative
Hudgin algorithm, they found a 10x speed-up for the centroid part of the calculation,
but only a 2x speed-up overall compared to a CPU-only implementation. They demon- 
strated that peak performance on a GPU does require a sufficiently large problem - the 
CPU out-performs the GPU when there are insufficient processing tasks to keep the 
GPU pipeline busy. 
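
The gap between a 10x speed-up for one stage and a 2x speed-up overall is what Amdahl's law predicts when only part of the runtime is accelerated. The sketch below is an illustration under that assumption only: the runtime breakdown of the wave-front code is not given here, and the model ignores host-to-GPU transfer overheads. The function names are hypothetical.

```python
def overall_speedup(accelerated_fraction, part_speedup):
    """Amdahl's law: total speed-up when a fraction of the original
    runtime is accelerated by part_speedup and the rest is unchanged."""
    return 1.0 / ((1.0 - accelerated_fraction)
                  + accelerated_fraction / part_speedup)


def fraction_for(overall, part_speedup):
    """Invert Amdahl's law: the fraction of the original runtime that
    must be accelerated by part_speedup to reach the given overall
    speed-up."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / part_speedup)


# Fraction of the original runtime that a 10x-accelerated stage would
# need to occupy for the whole calculation to run 2x faster.
needed = fraction_for(2.0, 10.0)
```

Under this simple model, a 10x stage yields 2x overall if that stage accounts for a little over half (5/9) of the original runtime; the remaining, unaccelerated work then dominates the total time.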



Schaaf & Overeem (2004) described a Common-Off-The-Shelf (COTS) correlator
platform constructed from GPUs, with an eye on future, low-cost solutions scalable
to the Square Kilometre Array (SKA). They achieved ~5x better performance (mea- 
sured as complex multiplications/second) for a 16x larger problem using an NVIDIA 
GeForce 6800 Ultra GPU, compared with a 2.8 GHz CPU. The price/Gflop and power 
usage/Gflop of the GPU were both about 3x better than for CPU. 

To examine some of the emerging trends in the adoption of GPUs in astronomy 
since these early projects, we perform simple bibliometrics using the SAO/NASA
Astrophysics Data System Abstract Service.⁵ An abstract-only search on various
combinations of the terms: GPU(s), graphics processing unit(s), CUDA, and OpenCL resulted
in 94 distinct abstracts from 2004-2011 (as of 1 October 2011). There were no relevant 
abstracts in 2005. An attempt was made to remove duplicate items (e.g. papers that 
appear separately as an arXiv version and a final published version). 

There are, of course, limitations with such an approach. We fail to identify those
publications that used GPUs, but did not declare this in the abstract (e.g. Fluke, Barnes, &
Hassan 2010), and not all publications on astronomy-related GPU applications appear in ADS
[e.g. Hamada & Nitadori (2010); Spurzem et al. (2010)]. While additional
details on the API and specific hardware used could have been obtained from each of the
publications, our intention was to obtain a quick snapshot of the current state of GPU 
development and the extent of early adoption in astronomy. We can answer questions 
such as "how are GPUs being used in astronomy?" (Figure 1) and "where are the results 
being published?" (Figure 2). 

Analysis of the abstracts reveals almost 50 unique computational problems in 30
broad application areas, ranging from adaptive optics and algorithm analysis, data
mining and digital signal processing, plasma and protoplanetary disk simulation, to
tree-codes and two-point correlation functions. The vast majority of abstracts present
GPU-based codes or methods (82/94). In this context a "method" relates to a demonstration
that a particular problem is suited to a GPU, and is often accompanied by a quoted
speed-up (relative to a single-core CPU, or, in a small number of cases, a multi-core
CPU implementation) or a peak processing performance [e.g. Hamada & Iitaka (2007);
Belleman, Bedorf, & Portegies Zwart (2008); Gaburov, Bedorf, & Portegies Zwart (2010)].
Of the remaining abstracts, 9 were clearly identifiable as presenting new scientific
results based on the use of an existing GPU code [e.g. Aubert & Teyssier (2010);
Banerjee, Baumgardt, & Kroupa (2010); Greig, Bolton & Wyithe (2011)], and three
dealt more generally with the "philosophy" of adopting GPUs for scientific computing
in astronomy [e.g. Barsdell, Barnes, & Fluke (2010)].

⁵ http://adsabs.harvard.edu

Figure 1. How are GPUs being used in astronomy? An ADS abstract-only search
(1 October 2011) on various combinations of the terms: GPU(s), graphics processing
unit(s), CUDA, and OpenCL resulted in 94 publications (there were no relevant
abstracts in 2005). The "other" category combines any application area with 3 or 
fewer abstracts. The year 2010 represents the commencement of wider adoption of 
GPUs by the astronomical community. 

As Figure 1 shows, the year 2010 marked the transition from early exploration of 
the capabilities and suitability of GPUs to a restricted number of problems, to one of 
widespread adoption across a broad range of application areas in astronomy (62 ab- 
stracts across 26 application areas since 2010 - the "other" category combines any ap- 
plication area with 3 or fewer abstracts). We anticipate that this trend will continue for at 
least the next few years, as the application market is far from being saturated. Amongst 
the early applications, Fourier transforms and pair-wise N-body forces were obvious,
"low-hanging fruit", with straight-forward parallelism. Recent works are tackling more
complex algorithms, such as general relativistic magnetohydrodynamics (Zink 2011).

N-body simulations (and related methods) stand out as being the most popular
target for both methods and scientific result abstracts (18/94). The emphasis on scientific
computing is clear, with only ten abstracts discussing visualisation/data analysis uses of 
GPUs - a number comparable with GPU-enabled signal processing for radio astronomy 
(9/94), adaptive optics (10/94) and hydrodynamics/magnetohydrodynamics (7/94). 




[Figure 2 plot: publications per year (2004, 2006-2011) by category: Journal, arXiv, Conference, PhD]



Figure 2. Where is GPU-related work being published? Journals include New
Astronomy, Monthly Notices of the Royal Astronomical Society, Astrophysical
Journal, Astronomy & Astrophysics, Publications of the Astronomical Society of
Australia; Conferences include SPIE and ADASS; the arXiv category includes
publications that are not clearly identifiable with one of the other categories.



It is interesting to see where GPU-related work is being published - see Figure 
2. We identify four categories of publication outlet: journals, arXiv preprints, confer- 
ence papers and PhD theses. The arXiv category includes papers that have appeared 
on the arXiv, but are not clearly identifiable with one of the other categories (as of 1 
October 2011).⁶ The main contributors to the conferences category are SPIE (11/94) -
mostly presentations on adaptive optics - and ADASS (6/94). The two main astronomy 
journals publishing refereed GPU-papers are New Astronomy (13/94) and Monthly No- 
tices of the Royal Astronomical Society (7/94). The important message from this is that 
journals are prepared to publish GPU methods papers, so get out there and turn your 
conference papers into more complete publications with details of your algorithms (and 
kernels), so that others can benefit from your experiences. 

Based on abstracts only, other trends were considered. The declared use of particular
programming APIs was: Cg (2; none since 2007), CUDA (26; since 2008), and OpenCL
(7; since 2010). Seventeen abstracts noted the use of a specific NVIDIA card, with Tesla/Fermi
cards increasing in prominence (NVIDIA S1070, C1060, and C2050 cards are identified
in six abstracts since 2010). Only two abstracts named ATI cards (Pang, Pen, & Perrone;
Elsen et al. 2007). At present, NVIDIA and CUDA do seem to dominate the scientific
computing market share, which may be due in part to the more recent appearance
of OpenCL as a general-purpose programming API suitable for ATI cards.

⁶ e.g. papers that are identified as being in-press or accepted to a named journal are counted in the journal
category; however, they are still counted towards the year in which they first appeared on the arXiv.

Reported speed-ups, relative to CPU-implementations, ranged from 7x [computation
of the Fast-Fourier Transform for adaptive optics in Rodríguez-Ramos et al. (2006)]
to 600x [solving Kepler's equations in Ford (2009)], with several projects highlighting
one-to-two order of magnitude improvements in performance [radio astronomy signal
processing - Harris, Haines, & Staveley-Smith (2008); magnetohydrodynamics -
Wong et al. (2009); cosmological lattice computations - Sainio (2010)].



We should treat speed-ups with some caution: achieving performance at the highest
end (100x) may be an indication that a less efficient, existing CPU-implementation
was being compared with a highly-optimised GPU solution. Moreover, a single precision
speed-up is often more impressive than for double (or quadruple - see Ginjupalli & Khanna
2010) precision, so consideration must be given to accuracy over performance. Devoting
additional time to optimising existing CPU-codes is possibly not time well spent,
particularly if a "simple" GPU code remains faster [see examples in Fluke et al. (2011)];
however, an investigation of the potential for parallelisation of an existing single-core
CPU code can lead to simple speed-ups through the use of libraries such as OpenMP⁷
on multi-core CPU architectures. On the other hand, speed-ups reported even 1-2 years
ago against single-core CPUs can comfortably be increased by a factor of a few: while 
GPU processing rates continue to grow, single-core CPU rates have stalled. 
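
The accuracy side of the single versus double precision trade-off can be seen in a small experiment. The sketch below runs on the CPU and emulates 32-bit arithmetic by rounding every intermediate result through Python's struct module; the step size, the iteration count and the function names are arbitrary choices for illustration, not values taken from any of the cited codes.

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def accumulate(step, count, single_precision):
    """Add step to a running total count times, optionally rounding
    the step and every partial sum to single precision."""
    if single_precision:
        step = to_float32(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single_precision:
            total = to_float32(total)
    return total


count = 100_000
expected = count / 10.0
error_single = abs(accumulate(0.1, count, True) - expected)
error_double = abs(accumulate(0.1, count, False) - expected)
```

The single precision error grows as the running total becomes large compared with the step, which is the situation faced by long accumulations such as force sums over many particles. Whether that error matters depends on the science requirement, so the faster precision is not automatically the right one.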

The current record holder for a "workstation" GPU (i.e. not a cluster, but still 
allowing for multiple GPUs within one device) is 1.28 Tflop/s on a Tesla S1070 for
computing direct ray-tracing for microlensing (Thompson et al. 2010). As with
speed-ups, flop-counts should be treated with caution as there can be a mismatch between
operations and clock-cycles when any mathematical operator beyond addition, subtrac- 
tion or multiplication is used. 



4. GPU-Powered Clusters 

In terms of theoretical processing power, a single GPU can achieve the same processing 
performance as a modest CPU cluster - provided the problem can fit in the memory of 
a single GPU - and a small cluster of GPUs can outperform an O(100)-node CPU-based
cluster, with a vast reduction in the amount of inter-node network connectivity required, 
and at a fraction of the hardware and operating cost. In this context, a GPU cluster is 
really a hybrid CPU-GPU system, as GPUs cannot manage important tasks like reading 
data from disks or supporting networks. 

A growing number of astronomical institutions are now investing in major
GPU-powered HPC clusters, with two of the first being the kolob compute cluster⁸ at the
University of Heidelberg, and the Silk Road facility⁹ operated by the National
Astronomical Observatories of China (see Spurzem et al. 2010).

⁷ http://openmp.org/
⁸ http://kolob.ziti.uni-heidelberg.de/

In 2010, the Australian astronomy community was invited to present expressions 
of interest to Astronomy Australia Limited (AAL) for new research infrastructure that 
could be funded by AAL through the Australian Federal Government's Education In- 
vestment Fund. One of the nine successful projects was gSTAR: the GPU Supercom- 
puter for Theoretical Astrophysics Research - a national high-performance computing 
facility for astronomers. Installation of the gSTAR facility at Swinburne University of 
Technology commenced in early September 2011. 

Phase 1 of gSTAR, with a theoretical peak of ~130 Tflop/s (single precision),
comprises 51 dual-socket compute nodes each with 2 GPUs (NVIDIA C2070; 6 GB RAM),
with an additional 3 high-density GPU compute nodes each containing 7 GPUs (M2090;
6 GB RAM). In excess of 1 Petabyte of usable disk space, supported by the Lustre file
system, is available, and compute nodes are connected via QDR InfiniBand. Phase 2
of gSTAR, scheduled for 2012, will see the addition of further GPUs. Early science
on gSTAR is expected to include: high-resolution N-body simulations of star clusters,
including incorporation of improved physics; a cosmological microlensing parameter
survey; and extensions to the interactive, real-time visualisation framework for
terascale datasets (Hassan, Fluke, & Barnes 2011).
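
The quoted theoretical peak can be checked with simple arithmetic. The sketch below assumes vendor-quoted single precision peaks of about 1.03 Tflop/s for a C2070 and 1.331 Tflop/s for an M2090; these per-card figures are not given in the text above and are labelled as assumptions in the code. It also assumes seven GPUs in each of the three high-density nodes, and it ignores any contribution from the host CPUs.

```python
# Assumed vendor-quoted single precision peak per card, in Tflop/s.
# These figures are not stated in the text and are illustrative.
PEAK_TFLOPS = {"C2070": 1.03, "M2090": 1.331}


def theoretical_peak(node_groups):
    """Sum nodes x GPUs-per-node x per-GPU peak over each node group."""
    return sum(nodes * gpus_per_node * PEAK_TFLOPS[card]
               for nodes, gpus_per_node, card in node_groups)


# Phase 1 as described above: (nodes, GPUs per node, card model).
phase1 = [(51, 2, "C2070"), (3, 7, "M2090")]
total_gpus = sum(nodes * gpus for nodes, gpus, _ in phase1)
peak = theoretical_peak(phase1)
```

Under these assumptions the 123 GPUs give a total a little above 130 Tflop/s, consistent with the approximate figure quoted for Phase 1. A theoretical peak of this kind is an upper bound; sustained performance on real codes is normally lower.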

As of June 2011, 3 of the top 5 facilities (the National Supercomputing Centers 
in Tianjin and Shenzhen, and the GSIC Center, Tokyo Institute of Technology) on the 
TOP500 Supercomputing Sites¹⁰ achieve their benchmark status, in part, through the
use of GPUs. These numbers are likely to rise in the next TOP500 list. Consulting
the Green500 List,¹¹ which ranks HPC facilities based on their energy efficiency, 4 of
the top 10 use GPUs (ATI Radeon for Nagasaki University and Universitaet Frankfurt;
NVIDIA for GSIC Center, Tokyo Institute of Technology and CINECA/SCA -
SuperComputing Solution). Moreover, 14 of the Top 20 Green sites are accelerator based,
using either GPUs or IBM Cell-based processors - further evidence that GPUs can 
provide impressive energy efficiency. 

Astronomical use of GPU clusters to date has included adaptive-mesh-refinement
calculations (e.g. Schive, Tsai, & Chiueh 2010); wavefront correction for adaptive
optics (Bouchez et al. 2009); spherical harmonic transforms for Cosmic Microwave
Background computations (Szydlarski et al. 2011); further progress on high-resolution,
N-body simulations (Spurzem et al. 2010); and real-time, interactive volume
rendering of terascale datasets (Hassan et al. 2011).

An innovative approach to gaining very high peak performance was taken by
Hamada & Nitadori (2010) through the use of low-end, commodity graphics cards (576
x NVIDIA GT200 cards). A very impressive sustained performance of 190 Tflop/s
for a 3 billion-particle, hierarchical N-body simulation, on an HPC cluster costing just
over $400,000, resulted in an honourable mention for the Gordon Bell Prize at
Supercomputing 2010. This raises an interesting comparison between high-end science
cards and commodity GPUs - the main differences are double precision performance,
availability of error correcting memory and total memory available. If these are not
concerns, then a commodity card may be sufficient.

⁹ http://silkroad.bao.ac.cn/
¹⁰ http://www.top500.org/
¹¹ http://www.green500.org (June 2011)



5. Accelerating the Rate of Discovery 

There are many reasons why astronomers might be excited about the prospects of
adopting GPUs. The most obvious benefits (see also Kirk & Hwu 2010) are the ability to:

• Run an individual problem faster. Computing a problem in a few minutes compared
to a few days, or a few days compared to a few months, allows real-time
solutions to computationally demanding tasks, such as detection of transient
radio events [Magro et al. (2011); Barsdell et al. (2011)].

• Run more problems in the same total wall time. Permits extensive exploration of
parameter space [e.g. black hole inspirals - Herrmann et al. (2010); solving
Kepler's equations - Ford (2009); Lyman-α forest simulations - Greig et al. (2011)].
This promises to be one of the most important new uses of GPU clusters, enabling
greater understanding of the effects of initial conditions, and allowing statistical
investigations rather than promoting over-analysis of a single simulation result.

• Solve a bigger problem size in the same wall time as a smaller problem size
on a CPU-system. This permits working at higher/improved resolution or provides
greater capacity to explore evolution of systems over more time-steps, and to handle
terascale, and ultimately petascale, image and spectral data cube processing
(Fluke et al. 2010), visualisation (Hassan et al. 2011) and data mining (Protopapas
2010). However, if the problem cannot fit within the memory of a single GPU, a
great deal of communication may be required between nodes of a GPU cluster. If
the bottleneck moves from computation to data transfer, then gains delivered by
a processing speed-up may be lost until such time as there is a corresponding
speed-up in bandwidth (between nodes), an increase in memory bus size (between host
and GPU), and a decrease in latency in the interconnect.

• Solve a more complex computational problem in the same wall time as a simpler 
problem on a CPU-system. E.g. use a more accurate solution method, which 
may exhibit better stability, etc.; enable the inclusion of additional physical prop- 
erties (e.g. magnetic fields); opportunities to utilise/implement algorithms with 
improved accuracy rather than an increase in resolution or problem size. 

• Provide much lower price/performance compared to an equivalent CPU-based 
cluster. Provides potential for more astronomers to access Tflop/s high per- 
formance computing on the desktop, rather than needing to apply/compete for 
time on national or institution-level HPC supercomputing facilities for all comp- 
utationally-demanding processing. 

The move from traditional CPU systems to GPU-accelerated computation is not 
without challenges. Identifying, implementing, and optimising relevant algorithms for 
the highly-parallel GPU architecture can require a greater understanding of computer 
science fundamentals than many science professionals possess: traditional sequential
programming skills are arguably easier for "astronomer-programmers" to learn than 
parallel programming techniques. More importantly, code that has been developed
specifically for single-core CPU will not run on a GPU without substantial modification.
In the short term, additional personnel time is required in order to develop GPU codes. 

In general, the best results on GPUs are seen for computations that exhibit a large 
amount of data parallelism (i.e. the same computation performed on many different data 
values) and high arithmetic intensity (i.e. a high ratio of floating point calculations to 
memory accesses). Learning to use a GPU effectively means gaining an understanding 
of a new range of programming tricks, including reducing branching conditions (if/then 
statements), making judicious use of over-computing (e.g. using zero-mass particles in 
the pairwise N-body force calculation) to keep GPU threads busy, and giving more
thought to memory access patterns. 
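
Two of the tricks listed above, over-computing with zero-mass particles and removing branches from the inner loop, can be sketched in plain Python. This is a CPU-only illustration under stated assumptions: the block size, the softening value, the G = 1 units and the function names are all hypothetical, and a real GPU kernel would assign each target particle to a thread rather than call a function per particle.

```python
def pad_to_block(positions, masses, block_size):
    """Append zero-mass particles so that the particle count is a
    multiple of block_size (a stand-in for a GPU thread-block size)."""
    remainder = len(masses) % block_size
    extra = 0 if remainder == 0 else block_size - remainder
    padded_positions = list(positions) + [(0.0, 0.0, 0.0)] * extra
    padded_masses = list(masses) + [0.0] * extra
    return padded_positions, padded_masses


def acceleration_on(i, positions, masses, softening=1e-3):
    """Acceleration on particle i (G = 1) with no branch in the loop.

    The softening keeps the self-interaction finite, and its zero
    separation means it adds nothing. Zero-mass padding particles
    also add nothing, so they are computed rather than tested for.
    """
    xi, yi, zi = positions[i]
    eps2 = softening * softening
    ax = ay = az = 0.0
    for (xj, yj, zj), mj in zip(positions, masses):
        dx, dy, dz = xj - xi, yj - yi, zj - zi
        r2 = dx * dx + dy * dy + dz * dz + eps2
        weight = mj / (r2 * r2 ** 0.5)
        ax += weight * dx
        ay += weight * dy
        az += weight * dz
    return ax, ay, az


positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
masses = [1.0, 2.0, 3.0]
padded_positions, padded_masses = pad_to_block(positions, masses, 4)
```

The padded system gives the same accelerations on the real particles as the original one, at the cost of a few wasted operations. On a GPU that waste is usually cheaper than letting threads within a block follow different code paths.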

By placing the emphasis on the "total time to science" (Fluke et al. 2011), rather
than time spent developing code for GPUs, some of this additional coding work should
be made up by the typically 10x (or greater) processing speed-ups. As a growing
number of GPU-programming and scripting libraries become available (e.g. PyCUDA
and Thrust), with a goal of improving developer productivity, the short-term need for
new code development may be reduced. Interactive data languages such as IDL can
also achieve acceleration through bindings to GPU libraries like GPULib, bringing
the potential for GPU-acceleration to non-C-programming astronomers. 



6. Concluding Remarks 

At the dawn of the petascale data era, astronomers will be faced with new challenges in 
data processing and computation. GPU-powered HPC clusters offer a low-cost oppor- 
tunity to explore new, scalable, massively-parallel algorithms. The processing speed- 
ups available with GPUs, for the right types of problems, are helping pave the way to 
new science, through higher-resolution simulations, improved physical modelling, and 
much greater exploration of parameter spaces. Ultimately, the goal of adopting any 
new hardware solution in astronomy should be to help improve and enhance our un- 
derstanding of the Universe. The future of computing for astronomy is here - and it is 
massively parallel. 

Acknowledgments. CJF has benefited greatly from GPU-related discussions with 
David Barnes, Ben Barsdell, and Amr Hassan. This research has made use of NASA's 